Architectural Support to Exploit Commutativity in Shared-Memory Systems
Abstract
Parallel systems are limited by the high costs of communication and synchronization. Exploiting commutativity has historically been a fruitful avenue to reduce traffic and serialization. This is because commutative operations produce the same final result regardless of the order they are performed in, and therefore can be processed concurrently and without communication. Unfortunately, software techniques that exploit commutativity, such as privatization and semantic locking, incur high runtime overheads. These overheads offset the benefit and thereby limit the applicability of software techniques.

To avoid high overheads, it would be ideal to exploit commutativity in hardware. In fact, hardware already provides much of the functionality that is required to support commutativity. For instance, private caches can buffer and coalesce multiple updates. However, current memory hierarchies can understand only reads and writes, which prevents hardware from recognizing and accelerating commutative operations. The key insight this thesis develops is that, with minor hardware modifications and minimal extra complexity, cache coherence protocols, the key component of communication and synchronization in shared-memory systems, can be extended to allow local and concurrent commutative operations. This thesis presents two techniques that leverage this insight to exploit commutativity in hardware.

First, COUP provides architectural support for a limited number of single-instruction commutative updates, such as addition and bitwise logical operations. COUP allows multiple private caches to simultaneously hold update-only permission to the same cache line. Caches with update-only permission can locally buffer and coalesce updates to the line, but cannot satisfy read requests. Upon a read request, COUP reduces the partial updates buffered in private caches to produce the final value.
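The update-only behavior described for COUP can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware protocol from the thesis: the class `CoupLine`, its methods, and the per-core dictionary are hypothetical names invented here, the only commutative operation modeled is addition, and coherence messages, evictions, and timing are ignored.

```python
from typing import Dict


class CoupLine:
    """Toy model of one cache line holding a counter (hypothetical, for illustration).

    Each core that performs a commutative add gets its own update-only copy,
    modeled as a buffered delta that starts at the identity element 0.
    """

    def __init__(self, initial: int = 0) -> None:
        self.shared_value = initial        # value held at the shared level
        self.partial: Dict[int, int] = {}  # core id -> locally buffered delta
        self.reductions = 0                # how many reductions reads have triggered

    def commutative_add(self, core_id: int, delta: int) -> None:
        # Update-only permission: the add is buffered and coalesced locally,
        # without touching the shared value or any other core's copy.
        self.partial[core_id] = self.partial.get(core_id, 0) + delta

    def read(self, core_id: int) -> int:
        # Update-only copies cannot satisfy a read, so a read first reduces
        # every buffered partial update into the shared value.
        # core_id is unused in this toy model; it only mirrors a real read request.
        if self.partial:
            self.shared_value += sum(self.partial.values())
            self.partial.clear()
            self.reductions += 1
        return self.shared_value


line = CoupLine(initial=10)
line.commutative_add(core_id=0, delta=1)
line.commutative_add(core_id=1, delta=2)
line.commutative_add(core_id=0, delta=3)  # coalesces with core 0's earlier add
print(line.read(core_id=2))               # prints 16
```

In this model three updates from two cores cost a single reduction, which is the effect the abstract attributes to buffering and coalescing in private caches.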
Second, COMMTM is a commutativity-aware hardware transactional memory (HTM) that supports an even broader range of multi-instruction, semantically commutative operations, such as set insertions and ordered puts. COMMTM extends the coherence protocol with a reducible state tagged with a user-defined label. Multiple caches can hold a given line in the reducible state with the same label, and transactions can implement arbitrary user-defined commutative operations through labeled loads and stores. These commutative operations proceed concurrently, without triggering conflicts or incurring any communication. A non-commutative operation (e.g., a conventional load or store) triggers a user-defined reduction that merges the different cache lines and may abort transactions with outstanding reducible updates.

COUP and COMMTM reduce communication and synchronization in many challenging parallel workloads. At 128 cores, COUP accelerates state-of-the-art implementations of update-heavy algorithms by up to 2.4×, and COMMTM outperforms a conventional eager-lazy HTM by up to 3.4× and reduces or eliminates wasted work due to transactional aborts.

Thesis Supervisor: Daniel Sanchez
Title: Assistant Professor
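The labeled reducible state that the abstract describes for COMMTM can likewise be sketched in software. This is a rough model under assumptions, not the thesis design: `LabeledLine`, `define_label`, `labeled_update`, and the `"SET_INSERT"` label are hypothetical names, transactions and aborts are left out entirely, and treating an update under a different label as non-commutative is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class LabeledLine:
    """Toy model of a line that can enter a labeled reducible state (hypothetical)."""

    def __init__(self, value: Any) -> None:
        self.value = value                 # merged value at the shared level
        self.label: Optional[str] = None   # label of the current reducible state
        self.copies: Dict[int, Any] = {}   # core id -> local reducible copy
        # label -> (identity factory, user-defined reduction function)
        self.labels: Dict[str, Tuple[Callable[[], Any], Callable[[Any, Any], Any]]] = {}

    def define_label(self, label: str, identity: Callable[[], Any],
                     reduce_fn: Callable[[Any, Any], Any]) -> None:
        self.labels[label] = (identity, reduce_fn)

    def labeled_update(self, core_id: int, label: str,
                       op: Callable[[Any], Any]) -> None:
        # Assumption of this sketch: a different label does not commute with
        # the outstanding updates, so those are reduced first.
        if self.label is not None and self.label != label:
            self.reduce()
        self.label = label
        identity, _ = self.labels[label]
        if core_id not in self.copies:
            self.copies[core_id] = identity()  # fresh local copy starts at the identity
        self.copies[core_id] = op(self.copies[core_id])  # purely local update

    def reduce(self) -> None:
        # User-defined reduction: merge every local copy into the shared value.
        if self.label is None:
            return
        _, reduce_fn = self.labels[self.label]
        for local in self.copies.values():
            self.value = reduce_fn(self.value, local)
        self.copies.clear()
        self.label = None

    def load(self) -> Any:
        # A conventional (non-commutative) load triggers the reduction.
        self.reduce()
        return self.value


line = LabeledLine(frozenset({1}))
line.define_label("SET_INSERT", identity=frozenset, reduce_fn=lambda a, b: a | b)
line.labeled_update(0, "SET_INSERT", lambda s: s | {2})
line.labeled_update(1, "SET_INSERT", lambda s: s | {3})
print(sorted(line.load()))  # prints [1, 2, 3]
```

Set insertion is used here because the abstract names it as a semantically commutative operation; the union-based reduction is one plausible choice of merge function, not necessarily the one the thesis uses.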